Efficient Entity Resolution on Heterogeneous Records
Similar resources
Efficient Entity Resolution on Heterogeneous Records
Entity resolution (ER) is the problem of identifying and merging records that refer to the same real-world entity. In many scenarios, raw records are stored in heterogeneous environments. Specifically, the schemas of the records may differ from each other. To better leverage such records, most existing work assumes that schema matching and data exchange have been done to convert records under diff...
Efficient Entity Resolution with MFIBlocks
Entity resolution is the process of discovering groups of tuples that correspond to the same real world entity. In order to avoid the prohibitively expensive comparison of all pairs of tuples, blocking algorithms separate the tuples into blocks which are highly likely to contain matching pairs. Tuning is a major challenge in the blocking process. In particular, contemporary blocking algorithms ...
Entity Resolution on Complex Network
Complex networks can describe the Internet, social networks, or, more broadly, any binary relation over a set of objects. The structural information of a complex network helps identify the entities that correspond to nodes in the network. There is much research in this area, and the authors introduce these studies and their results in this chapter. The authors mainly present two ...
Efficient Interactive Training Selection for Large-Scale Entity Resolution
Entity resolution (ER) has widespread applications in many areas, including e-commerce, healthcare, the social sciences, and crime and fraud detection. A crucial step in ER is the accurate classification of pairs of records into matches (assumed to refer to the same entity) and non-matches (assumed to refer to different entities). In most practical ER applications it is difficult and costly t...
The Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution
This paper describes a series of experiments in using logistic regression as a machine learning method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as a linked or not-linked pair, the evaluation of the model’s performance should take into account the transitive closure of its pairwise lin...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2020
ISSN: 1041-4347,1558-2191,2326-3865
DOI: 10.1109/tkde.2019.2898191